Class prediction for high-dimensional class-imbalanced data
Related resources
Software Defect Prediction for High-Dimensional and Class-Imbalanced Data
Software quality and reliability can be improved using various techniques during the software development process. One effective method is to utilize software metrics and defect data collected during the software development life cycle and build defect predictors using data mining techniques to estimate the quality of target program modules. Such a strategy allows practitioners to intelligently...
Class-imbalanced classifiers for high-dimensional data
A class-imbalanced classifier is a decision rule to predict the class membership of new samples from an available data set where the class sizes differ considerably. When the class sizes are very different, most standard classification algorithms may favor the larger (majority) class resulting in poor accuracy in the minority class prediction. A class-imbalanced classifier typically modifies a ...
Methods for class prediction with high-dimensional gene expression data
An increasing amount of genomic data has become available. The work deals with class prediction with high-dimensional gene expression data. Combining gene expression data with other data can improve the prediction of disease prognosis. The main part of the work is aimed at combining gene expression data with clinical data. We use logistic regression models that can be built through various regul...
Feature selection for high-dimensional class-imbalanced data sets using Support Vector Machines
Feature selection and classification of imbalanced data sets are two of the most interesting machine learning challenges, attracting growing attention from both industry and academia. Feature selection addresses the dimensionality reduction problem by determining a subset of available features to build a good model for classification or prediction, while the class-imbalance problem arises wh...
Evaluating Difficulty of Multi-class Imbalanced Data
Multi-class imbalanced classification is more difficult than its binary counterpart. Besides typical data difficulty factors, one should also consider the complexity of relations among classes. This paper introduces a new method for examining the characteristics of multi-class data. It is based on analyzing the neighbourhood of the minority class examples and on additional information about sim...
Journal
Journal title: BMC Bioinformatics
Year: 2010
ISSN: 1471-2105
DOI: 10.1186/1471-2105-11-523